Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system

Authors

  • Qing Zeng-Treitler
  • Sergey Goryachev
  • Scott T. Weiss
  • Margarita Sordo
  • Shawn N. Murphy
  • Ross Lazarus
Abstract

BACKGROUND: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease.

METHODS: The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard.

RESULTS: The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded.

CONCLUSION: We consider the results promising, given the complexity of the discharge summaries and the extraction tasks.
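
The accuracy figures above are computed only over cases for which the gold standard gives a usable label. The following is a minimal sketch of that kind of calculation, not the authors' evaluation code: the function name, the label strings other than "Insufficient Data", and the toy data are all hypothetical and chosen purely for illustration.

```python
# Sketch only: accuracy against a gold standard, skipping cases the
# gold standard marks as "Insufficient Data". Names and data are
# illustrative assumptions, not taken from HITEx.

INSUFFICIENT = "Insufficient Data"


def accuracy_excluding_insufficient(gold, predicted, excluded_label=INSUFFICIENT):
    """Return the fraction of correct predictions among cases whose
    gold-standard label is not `excluded_label`."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Keep only the cases the gold standard could actually label.
    scored = [(g, p) for g, p in zip(gold, predicted) if g != excluded_label]
    if not scored:
        raise ValueError("no cases left after exclusion")

    correct = sum(1 for g, p in scored if g == p)
    return correct / len(scored)


# Toy example with made-up smoking-status labels.
gold = ["Current Smoker", "Never Smoked", "Insufficient Data", "Past Smoker"]
pred = ["Current Smoker", "Past Smoker", "Never Smoked", "Past Smoker"]
acc = accuracy_excluding_insufficient(gold, pred)
```

In this toy example three of the four cases are scored, since the third is excluded by its gold-standard label, and two of those three predictions match. Excluding such cases shrinks the denominator, so the reported accuracy describes only the summaries in which the experts could determine an answer.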

Similar resources

A Clinical Use Case to Evaluate the i2b2 Hive: Predicting Asthma Exacerbations

To evaluate the i2b2 Hive as a tool to query, visualize, and extract clinical data, we selected a use case from the i2b2 airways diseases driving biology project: asthma exacerbations prediction. We analyzed the cohort selection and the extraction of the clinical data used by this asthma exacerbations prediction study. The structured data included the asthma diagnosis, birthdate, age, race, sex...

CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines

Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annota...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

A new text-mining method for extracting user context information to improve search engine result ranking

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual material increases day by day, so we need effective ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read many web documents and retrieve the ones most similar to the user ...

Viewpoint Paper: Identifying Patient Smoking Status from Medical Discharge Records

The authors organized a Natural Language Processing (NLP) challenge on automatically determining the smoking status of patients from information found in their discharge records. This challenge was issued as a part of the i2b2 (Informatics for Integrating Biology to the Bedside) project, to survey, facilitate, and examine studies in medical language understanding for clinical narratives. This a...

Journal title:
  • BMC Medical Informatics and Decision Making

Volume 6

Publication date: 2006